DANN: a deep learning approach for annotating the pathogenicity of genetic variants

Authors

  • Daniel Quang
  • Yifei Chen
  • Xiaohui Xie
Abstract

Summary: Annotating genetic variants, especially non-coding variants, for the purpose of identifying pathogenic variants remains a challenge. Combined annotation-dependent depletion (CADD) is an algorithm designed to annotate both coding and non-coding variants, and has been shown to outperform other annotation algorithms. CADD trains a linear kernel support vector machine (SVM) to differentiate evolutionarily derived, likely benign, alleles from simulated, likely deleterious, variants. However, SVMs cannot capture non-linear relationships among the features, which can limit performance. To address this issue, we have developed DANN. DANN uses the same feature set and training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear relationships among features and are better suited than SVMs for problems with a large number of samples and features. We exploit Compute Unified Device Architecture (CUDA)-compatible graphics processing units and deep learning techniques such as dropout and momentum training to accelerate the DNN training. DANN achieves about a 19% relative reduction in the error rate and about a 14% relative increase in the area under the curve (AUC) metric over CADD's SVM methodology.

Availability and implementation: All data and source code are available at https://cbcl.ics.uci.edu/public_data/DANN/.
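To make the abstract's ideas concrete, the following is a minimal sketch of a small neural network trained with dropout and momentum on a toy problem that no linear decision boundary can separate. It is not the DANN implementation: the toy data, the single hidden layer, the layer size, the learning rate, the dropout rate and all function names (`make_xor_data`, `train_mlp`, `predict_proba`) are assumptions chosen for illustration, and it runs with NumPy on the CPU rather than on a CUDA GPU.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_xor_data(n=200, seed=0):
    """Hypothetical toy data: label is 1 when both features share a sign.

    The two classes occupy opposite quadrants, so a linear boundary
    cannot separate them.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)
    return X, y


def train_mlp(X, y, hidden=16, epochs=3000, lr=0.1,
              momentum=0.9, drop_p=0.1, seed=1):
    """Full-batch training of a one-hidden-layer tanh network.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    keep = 1.0 - drop_p

    for _ in range(epochs):
        # Forward pass with inverted dropout on the hidden layer.
        h = np.tanh(X @ W1 + b1)
        mask = (rng.random(h.shape) < keep) / keep
        h_drop = h * mask
        prob = sigmoid(h_drop @ W2 + b2)

        # Backward pass for the mean cross-entropy loss.
        d_logit = (prob - y) / n
        grad_W2 = h_drop.T @ d_logit
        grad_b2 = d_logit.sum(axis=0)
        d_h = (d_logit @ W2.T) * mask * (1.0 - h ** 2)
        grad_W1 = X.T @ d_h
        grad_b1 = d_h.sum(axis=0)

        # Classical momentum: v = momentum * v - lr * grad; w = w + v.
        grads = [grad_W1, grad_b1, grad_W2, grad_b2]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v  # in-place update, so W1, b1, W2, b2 are modified

    return W1, b1, W2, b2


def predict_proba(X, params):
    """Prediction without dropout (inverted dropout needs no rescaling here)."""
    W1, b1, W2, b2 = params
    return sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)


X, y = make_xor_data()
params = train_mlp(X, y)
accuracy = float(((predict_proba(X, params) > 0.5) == (y > 0.5)).mean())
print(f"training accuracy: {accuracy:.2f}")
```

The hidden tanh layer is what lets the model bend its decision boundary around the quadrants. Dropout randomly silences hidden units during training, and the momentum term accumulates past gradients to speed up descent.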


Similar resources


KCNE1 and KCNE2 variants in Patients with Long QT Syndrome

Introduction: Long QT syndrome (LQTS) is a type of ventricular arrhythmia characterized by prolonged QT intervals on electrocardiogram or delay in ventricular repolarization and it can lead to syncope, seizure and sudden cardiac death. Here, KCNE1 and KCNE2 variants are studied among Iranian affected families with this syndrome. Materials and Methods: Fifty patients referring to Rajaei Cardiov...


The Effect of Teaching Self-Regulation Strategies on the Learning Approaches of First-Year High School Students

Abstract The present study was conducted to determine the effect of learning self-regulation strategies on surface, deep and strategic learning approaches of high school first grade female students in Yazd. The study method was pre-test and post-test design. For this purpose, a sample size of 57 subjects was selected by multistage cluster sampling method among high school first grade female ...


Detection of children's activities in smart home based on deep learning approach

Monitoring the behavior of children in the home is extremely important for avoiding possible injuries. Researchers have therefore considered automated systems for monitoring children's behavior. The first step in designing and running an automated monitoring system for children's behavior in closed spaces is recognizing their activity using the sensors in the e...


The Relationship of Study and Learning approaches with Students’ Academic Achievement in Rafsanjan University of Medical Sciences

Introduction: Most experts consider learning approach as the fundamental basis of learning dividing it into two parts of deep learning approach and surface learning approach. This is an endeavor to investigate the relationship between learning and study approaches with academic achievement among students in Rafsanjan University of Medical Sciences. Methods: This descriptive cross-sectional stu...



Journal:
  • Bioinformatics

Volume 31, Issue 5

Pages: -

Publication date: 2015